4 research outputs found

    How playlist evaluation compares to track evaluations in music recommender systems

    Most recommendation evaluations in the music domain focus on algorithmic performance: how well a recommendation algorithm can predict a user's liking of an individual track. However, individual track ratings might not fully reflect the user's liking of the whole recommendation list. Previous work has shown that subjective measures such as the perceived diversity and familiarity of the recommendations, as well as the peak-end effect, can influence the user's overall (holistic) evaluation of the list. In this study, we investigate how individual track evaluation compares to holistic playlist evaluation in music recommender systems, in particular how playlist attractiveness relates to individual track ratings and to other subjective measures (perceived diversity) or objective measures (objective familiarity, the peak-end effect, and the occurrence of good recommendations in the list). We explore this relation in a within-subjects online user experiment in which the recommendations for each condition are generated by different algorithms. We found that individual track ratings cannot fully predict playlist evaluations, as other factors such as perceived diversity and the recommendation approach can influence playlist attractiveness to a larger extent. In addition, including only the highest and last track ratings (peak-end) predicts playlist attractiveness as well as including all track evaluations. Our results imply that it is important to consider which evaluation metric to use when evaluating recommendation approaches.
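    As a rough illustration of the peak-end comparison described in this abstract (a minimal sketch, not the authors' code: the data are synthetic and all variable names are hypothetical), the Python snippet below regresses a holistic playlist score either on all per-track ratings or only on the highest and last rating, and compares cross-validated fit:

        # Sketch only: compare a "peak-end" predictor (highest + last track
        # rating) against all track ratings for predicting a holistic
        # playlist score. Data are synthetic, not from the study.
        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n_playlists, n_tracks = 200, 10

        # Hypothetical per-track ratings on a 1-5 scale.
        ratings = rng.integers(1, 6, size=(n_playlists, n_tracks)).astype(float)

        # Hypothetical holistic score, constructed so that peak and end matter.
        playlist_score = (0.5 * ratings.max(axis=1)
                          + 0.3 * ratings[:, -1]
                          + 0.2 * ratings.mean(axis=1)
                          + rng.normal(0, 0.3, n_playlists))

        X_all = ratings                                   # all track ratings
        X_peak_end = np.column_stack([ratings.max(axis=1),  # peak rating
                                      ratings[:, -1]])      # end rating

        for name, X in [("all track ratings", X_all), ("peak-end only", X_peak_end)]:
            r2 = cross_val_score(LinearRegression(), X, playlist_score, cv=5).mean()
            print(f"{name}: mean cross-validated R^2 = {r2:.2f}")

    Under the paper's finding, the two predictors would yield comparable fit; here the synthetic data are merely constructed to make that plausible.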

    Improving understandability of feature contributions in model-agnostic explainable AI tools

    Model-agnostic explainable AI tools explain their predictions by means of ‘local’ feature contributions. We empirically investigate two potential improvements over current approaches. The first is to always present feature contributions in terms of the contribution to the outcome that is perceived as positive by the user (“positive framing”). The second is to add “semantic labeling”, which explains the directionality of each feature contribution (“this feature leads to +5% eligibility”), reducing additional cognitive processing steps. In a user study, participants evaluated the understandability of explanations under different framing and labeling conditions for loan applications and music recommendations. We found that positive framing improves understandability even when the prediction is negative. Additionally, adding semantic labels eliminates any framing effects on understandability, with positive labels outperforming negative labels. We implemented our suggestions in a package, ArgueView [11].
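    To make the two manipulations concrete (a hedged sketch, not ArgueView's actual API; the explain() function and the example contributions are hypothetical), the snippet below renders signed feature contributions with positive framing and semantic labels of the kind quoted in the abstract:

        # Hypothetical sketch, not ArgueView: render feature contributions
        # with "positive framing" (always phrased toward the user-positive
        # outcome) and "semantic labels" stating each contribution's direction.

        def explain(contributions, positive_outcome="eligibility"):
            """contributions: feature name -> signed contribution toward the
            positive outcome, in percentage points."""
            lines = []
            for feature, delta in sorted(contributions.items(),
                                         key=lambda kv: -abs(kv[1])):
                # Positive framing: speak in terms of the positive outcome
                # even when this feature pushes against it.
                direction = "increases" if delta >= 0 else "decreases"
                # Semantic label: make directionality explicit ("+5% eligibility")
                # so no extra interpretation step is needed.
                lines.append(f"{feature} {direction} {positive_outcome} "
                             f"({delta:+.0f}% {positive_outcome})")
            return "\n".join(lines)

        # Hypothetical loan-application example.
        print(explain({"income": +5, "open debts": -8, "employment length": +3}))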
